Parameter tying and Gaussian clustering for faster, better, and smaller speech recognition

Authors

Ananth Sankar

Venkata Ramana Rao Gadde

Abstract

We present a new view of hidden Markov model (HMM) state tying, showing that the accuracy of phonetically tied mixture (PTM) models is similar to, or better than, that of the more typical state-clustered HMM systems. The PTM models require fewer Gaussian distance computations during recognition and can lead to recognition speedups. We describe a per-phone Gaussian clustering algorithm that automatically determines the number of Gaussians for each phone in the PTM model. Experimental results show that this method gives a substantial decrease in the number of Gaussians and a corresponding speedup with little degradation in accuracy. Finally, we study mixture weight thresholding algorithms to drastically decrease the number of mixture weights in the PTM model without degrading accuracy. More than a factor of 10 reduction in mixture weights is achieved with no degradation in performance.
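The abstract does not spell out the thresholding algorithms, so the following is only a minimal sketch of the general idea, under stated assumptions: mixture weights below a cutoff are dropped and the surviving weights are renormalized so they still sum to one. The function name `threshold_mixture_weights`, the cutoff value, and the fallback of keeping the single largest weight are all hypothetical choices made for illustration, not the authors' method.

```python
def threshold_mixture_weights(weights, threshold):
    """Drop mixture weights below `threshold` and renormalize the rest.

    Hypothetical helper for illustration; the paper's actual
    thresholding algorithms may differ.
    Returns a dict mapping Gaussian index -> renormalized weight.
    """
    # Keep only the components whose weight reaches the cutoff.
    kept = {i: w for i, w in enumerate(weights) if w >= threshold}

    if not kept:
        # Assumption: never leave a state with no components;
        # fall back to the single largest weight.
        best = max(range(len(weights)), key=lambda i: weights[i])
        kept = {best: weights[best]}

    # Renormalize so the surviving weights form a valid distribution.
    total = sum(kept.values())
    return {i: w / total for i, w in kept.items()}


# Example: one HMM state sharing a pool of five Gaussians.
weights = [0.5, 0.3, 0.15, 0.04, 0.01]
sparse = threshold_mixture_weights(weights, threshold=0.05)
```

Storing only the surviving index-weight pairs is what would shrink the model in such a scheme: a state that shares a large per-phone Gaussian pool needs to keep just the few components it actually uses.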

Similar resources

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

Parameter tying for flexible speech recognition

This paper presents two parameter tying techniques that enable a trade-off between the computational cost and the recognition performance of a speaker-independent flexible speech recognition system working over the telephone network. Parameter tying is conducted at the phonetic and acoustic levels. At the phonetic level, allophone and triphone based phonetic modeling are used simultaneously to achieve th...

Flexible Parameter Tying for Conversational Speech Recognition

Modeling pronunciation variation is key for recognizing conversational speech. Previous efforts on pronunciation modeling by modifying dictionaries only yielded marginal improvement. Due to complex interaction between dictionaries and acoustic models, we believe a pronunciation modeling scheme is plausible only when closely coupled with the underlying acoustic model. This paper explores the use...

A new look at HMM parameter tying for large vocabulary speech recognition

Most current state-of-the-art large-vocabulary continuous speech recognition (LVCSR) systems are based on state-clustered hidden Markov models (HMMs). Typical systems use thousands of state clusters, each represented by a Gaussian mixture model with a few tens of Gaussians. In this paper, we show that models with far more parameter tying, like phonetically tied mixture (PTM) models, give better...

Presentation of K Nearest Neighbor Gaussian Interpolation and comparing it with Fuzzy Interpolation in Speech Recognition

The hidden Markov model is a popular statistical method used in continuous and discrete speech recognition. The probability density function of the observation vectors in each state is estimated with discrete-density or continuous-density modeling. The performance (in correct word recognition rate) of continuous-density HMMs is higher than that of discrete-density HMMs, but its computation complexity is very ...

Publication date: 1999
